ABSTRACT



During the spring of 1989, the North Carolina Department of
Public Instruction field-tested a geometry proof performance
assessment as a component of the High School End-of-Course Testing
Program (E-o-C). The existing geometry E-o-C test consisted only
of multiple-choice items. The performance assessment was added to 
the multiple-choice component to form a more "authentic" assessment 
of student performance. Of primary concern in this study were the 
reliability of the scoring process and the cost of adding a performance 
assessment to the existing geometry test. 

The findings indicated a high degree of consistency between
the ratings assigned by two readers when perfect and adjacent agreement
were analyzed. The cost of conducting the performance assessment
was estimated at $3.00 per student ($2.44 for SY 1990).

The educational significance of using this process for developing
authentic assessment strategies is discussed. 
The Reliability of Using a Focused-Holistic
Scoring Approach to Measure Student
Performance on a Geometry Proof



Introduction 

While Webb and Romberg (1988) were presenting the new standards for
mathematics assessment adopted by the National Council of Teachers of
Mathematics (NCTM) at the New Orleans meeting of the AERA, state and local
testing departments were already putting together mathematics performance
tasks to assess student knowledge beyond the realm of the multiple-choice test.
The NCTM proposed that assessment be appropriate and meaningful in
facilitating mathematical communication among students. Webb and Romberg
(1988) provided examples of age/experience appropriate innovative assessment
tasks that emphasized critical thinking and problem-solving.

As with the NCTM, Wiggins (1989) argues for "authentic" assessment to
enable educators to have better knowledge of student ability in areas not amenable 
to multiple-choice assessment techniques. Wiggins (1989) states: 

Do we judge our students to be deficient in writing, 
speaking, listening, artistic creation, finding and citing 
evidence and problem solving? Then let the tests ask them 
to write, speak, listen, create, do original research and 
solve problems. Only then need we worry about scoring 
the performance, training the judges and adapting the 
school calendar to assure thorough analysis and useful
feedback to students about results. 

The North Carolina Department of Public Instruction (NCDPI) has endeavored to
translate the educational reform mandate legislated by state representatives into
more meaningful assessment activities. The North Carolina End-of-Course
(EOC) Testing Program at the secondary level is an outgrowth of the desire of state
legislators to standardize the statewide course of study and the basic educational
program offerings to the one hundred thirty-four (134) public school systems in
North Carolina. A set of common, or core, items on the EOC tests is used to
compare student performance across school systems. School systems are
encouraged to use the EOC test scores on the core items as a factor in assigning
final course grades for students. Additional items are assessed through those
tests for use in evaluating the extent to which school systems are implementing
the state-mandated curriculum goals and objectives in each subject area. For
example, each Algebra I student takes a 100-item test, of which 60 items are
common to all test forms and 40 items represent one of five forms. Therefore, 260
items (60 common items plus 40 on each of five forms) are measured in each
classroom. Some fourteen EOC tests will be put into
place by School Year (SY) 1992. Presently, multiple-choice EOC tests have been
implemented in Algebra I, biology, Algebra II, U.S. History, geometry,
chemistry, physics and English I. When implemented in SY 1992, the English II
assessment will have students write essays, some of which will be literature-based.



During 1988, NCDPI field-tested approximately 1,200 multiple-choice 
geometry test items which measured the mandated standard course of study 
curriculum goals and objectives (see Appendix A). Eight of the fourteen 
geometry curriculum goal areas include instruction in developing complete 
proofs. Traditionally, instruction in proofs has been considered an important 
objective in the high school curriculum for its focus on the development of logical
and precise thinking skills. The Mathematics and Testing Sections of NCDPI
determined that the best way to measure student ability to develop proofs is to have
the students formulate actual proofs during EOC testing and to have the proofs
scored on a common scale. Teachers and curriculum specialists advised that the
geometry EOC test would have greater face and content validity if it also contained
proofs. The item field test administered in 1988 therefore contained 20 proofs, two
each on the ten test forms. A Geometry Advisory Group (GAG) composed of 
geometry teachers, school system mathematics supervisors, college mathematics 
teachers, and NCDPI staff was formed to provide guidance and feedback to the 
Testing and Mathematics Sections on the development of a proofs assessment. 

After developing a scoring process for the proofs, it was decided to determine
the feasibility and reliability of a proofs assessment through a statewide field test
during SY 1989. The statewide field test not only afforded an opportunity to assess
the administration and scoring of geometry proofs, but also an opportunity for 
statewide staff development and awareness of the proposed measurement 
process. Although most EOC tests are administered at the end of the school year, 
the proofs were administered during the spring, and each student developed two 
proofs, one common proof and one of four variable proofs, so that five proofs were 
administered in each classroom. 



Objectives

The goals of the statewide field-test were to determine the feasibility of 
adding a geometry proof to the geometry EOC examination and to determine the 
reliability of scoring geometry proof performance exercises using the focused- 
holistic scoring approach. Specifically, this study sought to answer the following 
questions: 



1. What is the agreement rate for two independent readings of geometry 
proofs? Does the rate vary by type or difficulty of proof? Does the rate 
vary by scoring location? 

2. What is the reliability of proof ratings and proof scores? 

3. What are the relationships (predictive validity) between proof scores and 
geometry grades, geometry proof grades, a multiple-choice proofs test, 
and a multiple-choice geometry test? 

4. What is the cost of a geometry proofs performance assessment? 
Literature Review

One of the measurement issues arising during the educational reform 
movement of the late 70's/early 80's has been the most appropriate method for
assessing student ability on authentic performance tasks. Most research on this
topic has been done in the area of writing. Most educators have preferred to
collect and analyze writing samples as authentic measures of writing ability, but
the methodology for reliably assessing the resulting writing samples has raised 
many questions yet to be resolved. 

Rating procedures for assessing direct writing samples are plagued by 
many sources of error, including less than desirable scorer reliability. Some
educators have sought to circumvent problems associated with direct assessment
by using an indirect approach to measure writing ability: the language
expression, total language, or reading vocabulary sections of multiple-choice
examinations. While multiple-choice tests generally possess high levels of
reliability and validity based, in part, on their conformity to traditional
measurement techniques, others believe that direct measures of writing are more
concrete and valid indicators of writing ability (in the particular writing domain
chosen). The next section will identify some of the issues related to scoring direct
performance measures through the research on writing assessment.

Spandel (1981) has summarized many of the issues dealing with methods of
rating writing samples and the number of readings required for each. The four
most commonly used approaches for rating writing samples are holistic,
focused-holistic, analytic and primary-trait procedures. The holistic rating procedure
involves assessing a piece of writing to get an "overall" impression of its merits
based on a set of predetermined criteria. A range of factors is considered in
defining the criteria or overall quality of the writing sample. The focused-holistic
approach uses a specific, selected number of predetermined writing
characteristics that are selected for "focus" to provide an overall assessment of the
quality of the sample based on the selected domain. The key difference between the
holistic and the focused-holistic approaches is the number of characteristics
and specificity of criteria included under the "overall" umbrella. The analytic
approach, according to Spandel (1981), involves isolating one or more of the
predefined characteristics of writing and rating each independently. Analytic
procedures enable the rater to assess the student's ability to perform the specific
skills of writing (e.g., grammar, punctuation, organization and/or style). The
primary-trait procedure is similar to analytic in that attention is directed to
specific characteristics of a writing product; however, this procedure endeavors to
quantify the amount of the characteristic present in determining the appropriate
rating to be assigned.

The process of rating a writing sample varies with the scoring approach
used (Spandel, 1981). With holistic or focused-holistic approaches, a single score
is assigned to a writing sample based on the overall impression of the rater using
predetermined criteria and performance levels after one reading. Thirty to forty
papers can be read in one hour using this approach. In analytic and
primary-trait procedures, each factor is evaluated independently of other factors. If four
factors were being considered, every essay would need to be read four times so that
the strength of each factor could be judged independently.

Spandel (1981) concluded that the benefits of the rating method used
depend on the purpose for evaluating student writing. If the goal is to identify
overall quality of writing, then a holistic measure is the strongest indicator of
ability in the writing domain chosen. Ratings from analytic and primary-trait
approaches are more useful in directing instruction.

The most comprehensive review of research on standardized systematic 
assessment of writing ability was conducted by Cooper (1984) for the Educational
Testing Service (ETS). Cooper's thorough review considered the nature and 
limitations of essay and multiple-choice tests of writing ability, the statistical 
relationships of those types of tests, disaggregate performance indicators and the 
comparative cost effectiveness of various types of writing assessment. 

Cooper (1984) indicated that direct measures of writing are subject to lower 
reliability than are indirect measures because of the subjective nature of the 
ratings and the procedures used to assess writing samples. Conventional 
statistical approaches may fail to disclose differences based on changes in factors 
such as the rating standards used, flatness of the writing sample and time/order 
of rating. Essay ratings independently assigned by two to four readers (raters)
were considered to be more reliable than those assigned by a single rater. Any 
rating assigned is only as good as the training provided to the raters and the 
strength of the guides used to anchor score points. 

Cooper (1984) reported that the reliability of ratings assigned by
experienced, rested (fresh) raters is subject to variation from one rating to the 
next depending on the quality of writing being scored. Errors are more likely to be 
counted in poor rather than more skillfully produced essays. Interesting essays 
are more likely to receive higher ratings than minimally adequate but 
uninteresting essays. Further, the quality of preceding essays is likely to impact 
on the ratings assigned to subsequent essays. Fatigue also impacts on scores 
assigned. A tired rater can become more lenient, stricter or erratic in the scores
assigned for a given essay depending on the level of fatigue experienced. The
length of the reading period across a day or number of days can result in lower 
scores. Ratings done on the first day of a multiple period of readings tend to be 
higher than ratings assigned towards the end of a multiple day reading period 
(Cooper, 1984). 

A study of interrater agreement by Myers, McConville and Coffman (1966) 
across five days of ratings indicated that while the average daily rating 
correlations for readers across all papers over five days was .406, the average 
correlation on the fifth day was only .264. Among the conclusions of Myers et al. 
(1966) is that lack of vigilance exists as the end of any arduous task approaches no
matter how little time has been allocated to the scoring process. More recently,
North Carolina and other states have found that reader agreement increases over
time, testifying to the effectiveness of the monitoring of reader reliability that is
common in statewide assessments.

Conlan (1980) indicated that for any reading of essays, some effort must be
made to control variables such as the number and length of rest breaks; rules for
off-topic papers; and a system for handling unique or emotionally evocative 
papers. 

Cooper (1984) and Breland et al. (1987) reported that reliability estimates of
essay scores increase with the number of topics and/or readings per topic: "If
each essay receives two independent readings, the rater reliabilities are .65 for
three topics, .55 for two topics and .38 for one topic (the latter being the most
common)." Score reliabilities for single topics read one time ranged from .361 to
[illegible]. Breland et al. found that adding more topics contributes more to estimated
reliability and predictive validity than adding additional readings.
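
The quoted reliabilities are consistent with the Spearman-Brown prophecy formula, a standard psychometric result that projects the reliability of a measure lengthened by a factor k. The sources cited above are not stated to have derived their figures this way; the sketch below only assumes the single-topic value of .38 as a starting point and shows that lengthening to two and three topics yields values close to those quoted. The function name is illustrative.

```python
def spearman_brown(r_single: float, k: float) -> float:
    """Projected reliability when a measure is lengthened by a factor k.

    r_single is the reliability of the original (unit-length) measure.
    """
    return k * r_single / (1 + (k - 1) * r_single)


# Assumption: start from the quoted one-topic reliability of .38.
single_topic = 0.38
projected = {k: round(spearman_brown(single_topic, k), 2) for k in (1, 2, 3)}
print(projected)
```

Under this assumption the projections for two and three topics round to .55 and .65, matching the quoted values.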

Studies of the rating of performance tasks have been completed by Blok
(1985), Cooper (1984), Breland and Jones (1982) and others. Blok (1985) studied
multiple ratings obtained by having different raters and the same raters repeat
the judging of essays. Testing the theory of rater equivalence, that the "true"
scores of one rater will correlate perfectly with the true scores of another rater,
was the goal of the Blok (1985) study. Sixteen elementary school teachers rated
one hundred five (105) essays on a scale from 1 (very poor) to 10 (excellent) using a
holistic approach; no training was provided to the scorers. A second rating was
made on the 105 essays by the same 16 teachers three months later. The four
rater equivalence theories were tested using tests of linearity and the method of
linear structural equation modeling. In terms of the equivalence ratings of the
essays used in the Blok study (1985), the ratings of different raters were essentially
different measures (with rater correlations ranging from .415 to .910).

The rater variable is considered by Cooper to be a major and "unique"
source of measurement error in direct assessment. Cooper (1984) indicated that
scoring inconsistencies become more pronounced when dealing with a group of
readers. Some of the inconsistencies are the result of random error while other
inconsistencies are systematic, based on differences within the groups assigning
ratings. Breland and Jones (1982) indicated that inexperienced readers assigned
higher ratings than did experienced readers. Further, even when "experienced"
English teachers agree on scoring criteria and standards, they do not agree on the
extent to which any one of the criteria ought to be applied in any given
circumstance, even with training. Coffman (1977) added that inexperienced raters
are reluctant to assign rating scores that are too high or too low, so their scores
tend to cluster around the middle. As a result, the matter of which rater assigns
a score to a writing sample can make a difference in the scores of poor and good
essays.

Cooper (1984) reported that "over and over it has been shown that there can
be wide variance between the grades given to the same essay by two different
readers, or even by the same reader at different times". Coffman (1971) indicated
that ratings by a single pair of raters may result in excessive overestimations of
reliability. This finding differed from Cooper and Odell (1977) who estimated 90%
reliability between pairs of raters under controlled conditions. They cited rigid 
control of the training protocols as the single most important factor in producing 
consistency (Cooper and Odell, 1977). 

Hughes and Keeling (1984) studied the effect of offering model essays as
training devices to eliminate context effects for persons preparing to score writing
samples. Essays were collected from thirty-eight (38) students on a topic
determined by the researcher. The essays were typed, retaining the original
errors, and formed into twenty-five (25) booklets (randomly ordered). Each set of
essays was scored on a 0-25 point scale by a cadre of experienced English teachers
using predetermined criteria. The same five good, five poor, one criterion and
four filler essays were selected for placement in good or poor context booklets. In
the good context booklet, the five good essays were randomly placed first, next the 
criterion essay and last the filler essays randomly arranged with the poor essays. 
In the poor context booklet, the poor essays were randomly arranged in the first 
section of the booklet and the good essays were randomly arranged at the end of 
the booklet. Three annotated model essays (one good, one average and one poor) 
were used as models for the trained group. A set of five essays were scored by 
each group and served as the covariate group for the study. The findings were 
that context effects existed even after efforts were made to eliminate them. 
Hughes and Keeling (1984) concluded that contextual factors were likely to persist 
in performance assessments where writing samples or non-factual responses 
were being assessed. However, they held out hope that contextual factors would 
be less evident in performance measures that dealt with factual answers. 

The literature presented leads to the following conclusions. Ratings 
assigned to a writing sample by scorers untrained in the interpretation of the
criteria are less reliable than ratings assigned by trained scorers. Adequate
training, frequent rest breaks and pre-established rules for rating off-topic
responses improve the consistency of the ratings assigned by scorers. Rules for
recalibrating raters and monitoring the consistency of ratings assigned also
improve reliability. Some issues, such as the number of raters that should
read each paper, the best approach for scoring writing samples and the
appropriate statistical procedures to use in determining reliability/validity, have
not been resolved to the satisfaction of many "experts". However, it does appear
that rule setting and training are key factors required to improve the consistency
of the ratings assigned to writing samples no matter what scoring approach is
used. 



Methodology

Development of Scoring Process

During the summer of 1988 the Geometry Advisory Group (GAG) reviewed
student responses to sample geometry proofs developed by selected geometry
teachers from across the state. The proofs had been field-tested during the late
spring of 1988 in selected schools across North Carolina. Borrowing from
successful performance task measurement applications in writing, the GAG
considered the analytic and focused-holistic scoring approaches.

Initially, the GAG favored the analytic approach since specific errors could
be marked. After a period of study and some calculations, the GAG found that the
analytic approach could result in as many as twenty-seven (27) different scoring
guides for one proof. This was partly due to the fact that the GAG strongly felt
that any proof administered statewide should be solvable from multiple
approaches. The GAG recommended use of the focused-holistic approach to
assess student performance on geometry proofs for four reasons: 1. the ability to
develop a single scoring guide containing various strategies for solving the proofs
amenable to the focused-holistic scoring approach; 2. the belief that training of
scorers could better be accommodated using the focused-holistic approach; 3. the
efficiency and speed of focused-holistic scoring; and 4. the previous success of the
focused-holistic approach with writing assessment in North Carolina. The group
determined the scoring criteria and score point descriptions to be used in the
statewide field test. The score scale contained five (5) points with a four (4)
reflecting a nearly perfect proof and a zero (0) reflecting a blank or completely
erroneous proof (see Appendix B).

The focused-holistic scoring process has been widely used in the
assessment of writing (e.g., Cooper, 1984; Stevenson, 1988). This approach
considers the overall sense of completeness of the writing sample based on a
predetermined set of criteria. To use the focused-holistic approach appropriately,
agreement has to be reached on the criteria to be used and the characteristics of the
criteria at each score point level. Each individual data sample is read and
evaluated (using the established criteria and score points) by one or more raters.
Cooper (1984) has reported scorer reliability in writing assessment-related studies
for two or more readers ranging from .41 to .89, varying with the background and
length of training on the scoring criteria. The NCDPI has reported perfect scorer
agreement rates of more than 70%, adjacent agreement rates (no more than one
point difference between the two ratings) approaching 30% and less than 7%
differing by more than one point on a four-point focused-holistic score scale.

Staff Development and Awareness Training

NCDPI Mathematics and Testing Coordinators based in the eight regional
centers were trained to use the scoring guides during November 1988. During
December of 1988 and January of 1989, these Coordinators conducted regional
Geometry Proof Awareness Session(s) attended by at least one (1) geometry
teacher from each school in the region that provided geometry instruction.
Participants were trained to score the sample proof and informed of the
logistics of the spring 1989 statewide field-test. After developing skill in scoring
practice sets of geometry proofs, participants received scoring guides, training
practice sets and scoring keys for use in training other geometry teachers and
exposing students to the test format, scoring process and assessment criteria
prior to the actual field-test administration. Many of those same geometry
teachers returned during late April to score the geometry proofs of students from
their regions.

Data Collection 

Proofs were administered to more than 43,000 geometry students during the 
period March 20 - April 7, 1989 on scannable 11x17 folded answer documents. 
Student identification information was printed on page 1, directions on page 2, the
common proof exercise on page 3 and one of four variable proofs on page 4. The 
four different forms, identified by the four variable proofs, were printed in 
different ink colors for ease in identification during scoring. The four forms were 
spiraled throughout each classroom. 
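
Spiraling means handing out the forms in rotation so that each form is spread evenly through a classroom. A minimal sketch of that assignment follows; the form labels A to D and the roster names are hypothetical, and the actual distribution procedure used in the field test is not described in more detail than the sentence above.

```python
from itertools import cycle


def spiral_forms(students, forms=("A", "B", "C", "D")):
    """Assign forms in rotation (A, B, C, D, A, ...) down a class roster."""
    return dict(zip(students, cycle(forms)))


# Hypothetical roster of nine students in one classroom.
roster = [f"student_{i}" for i in range(1, 10)]
assignment = spiral_forms(roster)
for name, form in assignment.items():
    print(name, form)
```

With rotation, the counts of the four forms in any one classroom differ by at most one.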

The central office Test Coordinator for each of the school systems collected
completed geometry proof field-tests and forwarded them to the Regional
Research and Testing Coordinator. Proofs were shipped from regional centers to
an outside contractor so that identifiable data linking a proof with a specific
student, school or school system could be removed, eliminating those factors as
potential sources of scorer bias, and so that scoring packets could be developed. The
outside contractor separated the proofs into four form/color groups, printed packet
identification sheets, and stapled these sheets to the proof sets. Four scannable
monitor sheets for recording independent scores were produced for each packet.
The first two listed the proofs in the order in which they lay face up in the packets. The
other two listed the proofs in reverse order for use in scoring the variable proofs on
the reverse side. The four sheets were inserted into each packet along with the
proofs.

During scoring, reader one removed monitor sheet number one from the
packet, recorded a reader identification number, and verified that the proofs in
the packet matched the proof identification numbers on the monitor sheet. After
scoring the proofs, reader one returned the proofs to the packet envelope, which
still contained the other three monitor sheets, and placed the monitor sheet used
on top of the packet. NCDPI staff retrieved completed packets, reviewed the
monitor sheet, and randomly re-circulated the packet to a second reader. The
completed monitor sheets were then scanned on NCS Sentry 3000 tabletop
scanners connected to an IBM personal computer. Data were stored on floppy
diskettes using a software program developed by NCDPI. In addition, reader
reliability reports were generated to monitor reader agreement and progress in
scoring. Highly discrepant readers were retrained and proof scores requiring
resolution (discrepant by more than one score point) were identified and resolved
on the spot by specially trained scorers.
The data diskettes produced in the eight regions were merged at the
NCDPI state testing office with student background information provided by the
outside contractor on data tape. Rosters of student scores were returned to
geometry teachers prior to the end of the school year so that final scores could be
coded on answer sheets for the EOC multiple-choice geometry test. All EOC tests
were scored at a high school or central office site in each school system and
rosters of scores were produced for use in assigning grades to students.
(Geometry grade rosters included core multiple-choice and common proof scores.)
Data diskettes with all EOC scores were forwarded to the Regional Testing
Coordinator for final (in-region) editing and shipment to the Testing Section
where the statewide report was prepared (see Appendix F). Summary reports of
proof scores were generated in this fashion for 43,926 secondary school students.

Multiple-choice proofs were also field-tested in selected sites during
May 1989. The multiple-choice proofs paralleled the proofs administered during
March-April 1989.

Reader Training and the Scoring Process

Each school system provided a minimum of one (1) geometry teacher from
each school where geometry was taught to participate in the regional scoring
process. In the three largest regions, two full-day scoring sessions were held. All
of the regional sessions were held on school days. School systems paid substitute
teacher expenses for geometry participants from their school systems.
Participating teachers received certificate renewal credit for their participation.
The number of teachers involved in the scoring process ranged from twenty-nine
(29) in the smallest region to ninety-one (91) in one of the larger regions.

On the first day of scoring, geometry teachers selected by the school systems
were trained by one of three scoring directors to use the common proof scoring
guide. Teachers scored three (3) training sets of 10 to 15 proofs each (35 total)
that provided exposure to the scoring characteristics and distinctions
between each score point and the variability within score points. Finally, teachers
took a qualifying exam with 70% accuracy (perfect agreement with the designated
score point) required in order to be eligible to score common proofs. Teachers who
fell below the criterion were re-trained and administered additional qualifying
exams. The entire training and qualifying process took approximately three
hours. Statewide, only 1% of the 423 readers failed to qualify to score.
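
The qualifying criterion can be expressed as a simple check of exact agreement against the designated score points. The sketch below is illustrative only: the function name, the ten-proof exam length and the score values are hypothetical, and only the 70% perfect-agreement threshold comes from the description above.

```python
def qualifies(reader_scores, key_scores, threshold=0.70):
    """Return True if the share of exact matches with the key meets the threshold."""
    if len(reader_scores) != len(key_scores):
        raise ValueError("reader and key must cover the same proofs")
    exact = sum(r == k for r, k in zip(reader_scores, key_scores))
    return exact / len(key_scores) >= threshold


# Hypothetical ten-proof qualifying exam scored on the 0-4 scale.
key = [0, 1, 2, 3, 4, 2, 3, 1, 4, 0]
reader_a = [0, 1, 2, 3, 4, 2, 3, 2, 3, 1]  # 7 of 10 exact matches
reader_b = [0, 1, 2, 3, 4, 2, 2, 2, 3, 1]  # 6 of 10 exact matches
print(qualifies(reader_a, key), qualifies(reader_b, key))
```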

Common proofs were read twice. The first reader assigned a rating (0 - 4)
to the common proof based on the criteria (see Appendix B for the score scale).
The ratings from the two readings were averaged (e.g., if rater one
assigned a rating of "1" and rater two assigned a rating of "2", a score of 1.5 was
assigned as the proof score). When the ratings assigned by the two readers differed by more
than one (1) point, a staff person from the NCDPI Mathematics Section or a
participating GAG member read the proofs with discrepant scores and assigned
the final score.
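
The scoring rule just described can be sketched as follows. This is an illustration under stated assumptions rather than the software NCDPI used: the function name and the `third` parameter are hypothetical, and the resolution reading is taken to replace the two discrepant ratings as the final score, as the paragraph above indicates.

```python
def resolve_score(rating_1, rating_2, third=None):
    """Combine two independent 0-4 ratings into a proof score.

    Ratings within one point of each other are averaged. Ratings more than
    one point apart require a resolution reading, which becomes the score.
    """
    if abs(rating_1 - rating_2) <= 1:
        return (rating_1 + rating_2) / 2
    if third is None:
        raise ValueError("discrepant ratings need a resolution reading")
    return float(third)


print(resolve_score(1, 2))           # adjacent ratings are averaged
print(resolve_score(3, 3))           # identical ratings
print(resolve_score(0, 3, third=2))  # discrepant ratings, resolved by a third reader
```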
After scoring of the common proof was complete, section leaders trained
participants to score the variable proofs. Section leaders were either members of
the GAG or teachers from the winter awareness sessions who demonstrated skill
and understanding of the rating process as observed through the winter practice
exercises. Each section leader provided training on specific characteristics of
their variable proof. Each variable proof was initially read once, with a second
reading occurring only if time permitted. Scores from the variable proofs were
not considered by teachers in evaluating student course performance.

Sample 

The sample for the proofs and the comprehensive geometry multiple-choice 
test was the entire statewide enrollment in geometry classes in North Carolina, 
more than 43,000 students. The separate, 32-item multiple-choice proofs field test 
was administered to a convenience sample of 875 students from schools in each of
the eight educational regions. 

Measures 

Each student completed two proofs, one common and one of four variable 
proofs (see Appendix B for the proof exercises). Each common proof was scored 
twice. Most variable proofs were scored once, but a substantial portion were 
scored twice. This resulted in as many as four reader scores for each student, and as many
as ten reader scores across the sample of five geometry proofs. For the common
proof, and when possible for the variable proofs, scores were combined to produce
composite scores. All students also took a comprehensive multiple-choice
geometry test at the end of the year. In addition, a sample of students took a
multiple-choice test focusing on the same five proofs. Six to seven items were
specific to each of the five proofs. Teachers recorded the final course grade they
expected to give each student at the end of the year, the course grade as of the
proofs assessment in the spring, and a grade assessing the student's proofing
skill at the time of the proofs assessment. The following list gives all the variables
used in this study. The N counts in parentheses are the counts for all analyses
using these variables. When two or more variables are related with differing N 
counts, the analysis is based on the lower of the two N counts. 
Focused-holistic scores of proofs on a scale of 0 (low) to 4 (high):

1. Common Proof, Reading 1 (N=43,926)

2. Common Proof, Reading 2 (N=43,926) 

3. Variable Proof A, Reading i rN=ll,177) 

4. Variable Proof A, Reading 2 (N=2,773) 
6. Variable Proof B, Reading 1 (N=ll,017) 

6. Variable ProofB, Reading 2 (N=5,612) 

7. Variable Proof C, Reading 1 (N=10,925) 

8. Variable Proof C, Reading 2 (N=4,951) 

9. Variable Proof D, Reading 1 (N=10,807) 

10. Variable Proof D, Reading 2 (N=4,304) 

Focused-holistic score composites 

11. Common Proof: (l+2)/2 (N=43,926) 

12. Variable Proof A: (3+4)/2 (N=2,773) 

13. Variable Proof B: (5+6)/2 (N=5,612) 

14. Variable Proof C: (7+8)/2 (N=4,951) 

15. Variable Proof D: (9+10)/2 (N=4,304) 

Multiple-choice test scores 

16. NC Test of Geometry: score on 60-item core test (N=43,325) 

17. Multiple-choice proofs test: score on 32-item test of same 5 proofs
(N=875)

18. Multiple-choice common proof test: score on 6-item subtest (N=875) 

19. Multiple-choice variable proof A test: score on 7-item subtest (N=217) 

20. Multiple-choice variable proof B test: score on 6-item subtest (N=221)

21. Multiple-choice variable proof C test: score on 6-item subtest (N=218) 

22. Multiple-choice variable proof D test: score on 7-item subtest (N=219) 

Instructor's ratings 

23. Course grade in geometry at end of the year (N=43,067)

24. Grade in geometry at time of proofs assessment (N=43,400)

25. Grade in proofing skill at time of proofs assessment (N=43,103)
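
As a minimal sketch of how the composites in variables 11 through 15 could be formed, the Python below averages two readings of the same proof and applies the lower-N rule stated before the list. The function names `composite_score` and `analysis_n` are illustrative assumptions and are not taken from the original scoring system.

```python
from typing import Optional


def composite_score(reading_1: int, reading_2: Optional[int] = None) -> Optional[float]:
    """Average two focused-holistic readings (each 0-4) of the same proof.

    Returns None when the second reading is missing, mirroring the fact that
    composites exist only for proofs that were read twice.
    """
    for score in (reading_1, reading_2):
        if score is not None and not 0 <= score <= 4:
            raise ValueError("focused-holistic scores range from 0 to 4")
    if reading_2 is None:
        return None
    return (reading_1 + reading_2) / 2


def analysis_n(n_first: int, n_second: int) -> int:
    """N used when two variables with different counts are related: the lower one."""
    return min(n_first, n_second)


print(composite_score(3, 4))       # composite for readings of 3 and 4
print(composite_score(2))          # no composite: proof was read once
print(analysis_n(43_926, 2_773))   # common-proof composite vs. variable proof A composite
```

Because a composite is the mean of two whole-number readings, it can take half-point values such as 2.5 or 3.5.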

Results 

Agreement Rates

Tables 1 and 2 summarize the reader agreement rates for the geometry proof
field test. On the common proof, approximately 66% of the proofs received
the same score on two different readings. Adjacent agreement, or the percentage
of proofs receiving scores within one point of each other, was 30.7%, and 3.4% of
the common proofs received scores differing by more than one point and were
read a third time by a specially trained scorer.
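
The three agreement categories can be tallied as in the following sketch. The function name `agreement_rates` and the ten sample score pairs are hypothetical; only the category definitions (identical scores, a one-point difference, and a difference of more than one point that requires a third reading) come from the text.

```python
from typing import Dict, List, Tuple


def agreement_rates(pairs: List[Tuple[int, int]]) -> Dict[str, float]:
    """Percentage of proofs in each agreement category for two readings."""
    if not pairs:
        raise ValueError("at least one pair of readings is required")
    n = len(pairs)
    perfect = sum(1 for first, second in pairs if first == second)
    adjacent = sum(1 for first, second in pairs if abs(first - second) == 1)
    needs_resolution = n - perfect - adjacent  # readings differ by more than one point
    return {
        "perfect": 100.0 * perfect / n,
        "adjacent": 100.0 * adjacent / n,
        "resolution": 100.0 * needs_resolution / n,
    }


# Hypothetical readings (reader 1, reader 2) on the 0-4 scale.
sample_pairs = [(4, 4), (3, 3), (2, 3), (0, 0), (1, 1),
                (2, 2), (4, 3), (1, 3), (0, 1), (2, 2)]
print(agreement_rates(sample_pairs))
```

By construction the three percentages sum to 100, which is also a convenient check on any row of a reported agreement table.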

Agreement rates on the other proofs varied somewhat by type of proof, from
a low of 65.6% to a high of 80.6% perfect agreement. The highest agreement rates
occurred for the three-dimensional proof (variable proof B) and the parallel line
proof (variable proof C), both of which were difficult, with almost 60% of the scores
either a 0 or 1. In addition, variable proof C appeared to be one where students
"either knew it or they didn't". Although there were a large number of 0 and 1
scores, students also received a relatively large percentage of 4 scores. Lower
agreement rates were evidenced on proofs with the largest percentage of scores in
the 1 to 3 point range, i.e. in the middle of the score scale, where it is usually more
difficult to score accurately.

Table 1

Percentage Agreement Between Two Readings of Geometry Proofs

                                          Difference
             Perfect       Adjacent       Requiring
Proof        Agreement     Agreement*     Resolution

Common       65.9%         30.7%          3.4%
A            68.1%         30.3%          1.6%
B            73.8%         25.7%          0.4%
C            80.6%         18.4%          1.0%
D            65.6%         32.2%          2.1%

*1 point difference


Table 2 gives the agreement rates for the various scoring sites. The perfect
agreement rates ranged from a low of 62.8% to a high of 68.3%, with 5 of the 8
sites having rates of approximately 65 to 66%. Different readers were involved at
the eight sites; [illegible] directors/trainers were used [illegible].



Table 2

Percentage Agreement Between Two Readings of Common Proof
for Each Scoring Site

                                          Difference
Scoring      Perfect       Adjacent       Requiring
Site         Agreement     Agreement*     Resolution

1            65.4%         31.6%          3.1%
2            65.9%         30.6%          3.5%
3            67.3%         29.7%          3.0%
4            62.8%         32.9%          4.3%
5            65.0%         31.8%          3.3%
6            68.3%         28.7%          3.0%
7            65.1%         31.6%          3.3%
8            66.0%         30.3%          3.7%

*1 point difference

Correlational Estimates of Reliability

A common method of assessing essay scoring reliability is to correlate the
scores assigned by different readers to the same essay. As noted by Breland et al.
(1987), these estimates are inflated because they reflect only one source of error.
Table 3 gives the correlations between two independent scorings of the same proof
for the common proof and the four variable proofs. These estimates of the
reliability of one reading of each proof range between .822 and .948. Breland et al.
(1987) also point out that these estimates can be "stepped up" using the Spearman- 
Brown formula to obtain estimates of the reader reliability with two readings of 
each proof. 
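
The "stepping up" described here can be sketched in a few lines of Python using the standard Spearman-Brown prophecy formula. The function name `spearman_brown` is an assumption made for illustration; the input correlations are the single-reading reader reliabilities reported for the common proof and variable proof C.

```python
def spearman_brown(r_single: float, k: float = 2.0) -> float:
    """Projected reliability when the number of readings is multiplied by k.

    r_single is the reliability of one reading (here, the correlation
    between two independent readings of the same proof).
    """
    if not -1.0 < r_single <= 1.0:
        raise ValueError("r_single must be a valid correlation")
    return k * r_single / (1.0 + (k - 1.0) * r_single)


# Single-reading reader reliabilities from the text.
for label, r in (("Common", 0.871), ("C", 0.948)):
    print(label, round(spearman_brown(r), 3))
```

With k = 2 the formula reduces to 2r / (1 + r), which is the case relevant to proofs that received two readings.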





Table 3

Correlational Estimates of Reader Reliabilities

             Reader Reliability     Estimate for
Proof        r                      2 Readings

Common       .871                   .931
A            .869                   .930
B            .822                   .902
C            .948                   .973
D            .864                   .921


Since students responded to two proofs each, reliability estimates can also 
be calculated by correlating the scores on each proof. Table 4 gives these 
correlations for one reading, two readings of the common proof and one reading of
the variable proof, and two readings of each proof. The three correlations per
proof combination give the reliabilities of giving one proof under the three 
different scoring conditions. The estimates for one proof read once range from 
.522 (Common vs. B) to .627 (Common vs. A), with an average of .590 across proof 
types. As would be expected, as the number of readings increases, the reliability 
estimates increase slightly. 

The last column gives reliability estimates for two proofs obtained using the 
Spearman-Brown formula. These estimates demonstrate that the reliability of a 
proof assessment can be increased dramatically with the addition of another 
proof. 
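
As a worked illustration of the step from a one-proof to a two-proof reliability, the general Spearman-Brown formula and its k = 2 case are shown below, applied to the one-reading correlation of .627 between the common proof and variable proof A. The symbols \rho_1 (reliability of a single proof) and \rho_k (projected reliability of k proofs) are notation introduced here, not the authors'.

```latex
\rho_k = \frac{k\,\rho_1}{1 + (k-1)\,\rho_1},
\qquad
\rho_2 = \frac{2(.627)}{1 + .627} = \frac{1.254}{1.627} \approx .771
```

The same substitution with .590 (common proof versus variable proof D) gives 1.180 / 1.590, or approximately .742.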

Table 4

Correlational Estimates of Reliabilities of Proofs
Receiving One Reading or Two Readings

                 Number of                      Reliability Estimate
Proofs           Readings       r               for Two Proofs

Common vs. A     1 each         .627            .771
                 2 common       [illegible]     .787
                 2 each         .664            .798

Common vs. B     1 each         .522            .686
                 2 common       [illegible]     .700
                 2 each         .564            .721

Common vs. C     1 each         .619            .765
                 2 common       .642            .782
                 2 each         .649            .787

Common vs. D     1 each         .590            .742
                 2 common       .610            .758
                 2 each         .634            .776

Average          1 each         .590            .741
                 2 common       .610            .757
                 2 each         .628            .771

Predictive Validity of the Proof Assessment

The analysis below gives the predictive validities of both the proofs
performance assessment and two multiple-choice tests. A total of five outcomes
can be analyzed for the proofs themselves:

1. Course grade--grade instructor expects to give the student, reflecting
overall geometry performance, not just proofing ability;

2. Proofs grade--instructor judgement of proofing skill at time of
assessment;

3. MC proofs test--32-item multiple-choice test of same five proofs;

4. MC proofs subtest--6- to 7-item subtest of MC proofs test of the same proof
as solved by the student;

5. MC geometry--60-item test covering entire geometry course content.

Table 5 gives the correlations of the various proofs with the five outcome
variables for one and two readings of each proof. As would be expected, the
validity estimates are slightly higher for the scores based on two readings,
reflecting their higher reliability. When two proof scores are combined (one
reading), between .05 and .10 is added to the predictive validity related to proofs
grades. Reading the proofs twice adds negligibly to the correlations. Also as
expected, the correlations are somewhat higher when related to grades in
proofing skill rather than overall geometry performance.

Correlations with the multiple-choice proofs test are generally higher than
those with the proofs grade, reflecting the higher reliability of the 32-item test
than of teacher judgements about proofing skill. The subtest scores are for the
multiple-choice items that relate to the same common or variable proof.

Table 5

Predictive Validities of Proof Assessments

                          ----------------------- Correlations with -----------------------
            Number of     Course      Proofs        MC Proofs     MC Proofs     MC
Proof       Readings      Grade*      Grade         Test          Subtest       Geometry

Common      1             .511        [illegible]
            2             .528        .603          [illegible]   [illegible]   [illegible]

A           1             .519        .614          .644          .561          .614
            2             .534        .629

Common+A    1             .570        [illegible]
            2             .580        .683

B           1             .480        .548          .557          .457          .573
            2             .507        .574

Common+B    1             .579        .658
            2             .596        .674

C           1             .542        .632          .650          .651          .686
            2             .554        .644

Common+C    1             .589        .685
            2             .598        .693

D           1             .521        .622          .577          .530          .640
            2             .543        .647

Common+D    1             .576        .682
            2             .593        .698

*Obtained at time of geometry proof assessment.

Note: Missing cells are due to the fact that the datasets containing the
multiple-choice tests include only the final scores on the geometry proofs, which were the
combined readings for the common proof and one reading for each of the
variable proofs.



Table 6 displays the distribution of scores for students who participated in 
the multiple-choice proofs field test. More than half of the students who could not 
complete a proof at all on the performance assessment (scores of 0 through 1.0) got 
4 to 6 of 6 items correct on the same proof in a multiple-choice (completion) 
format, and almost one-quarter received perfect scores in this format that
requires only recognition, rather than recall and production. These results are
somewhat confounded by the fact that the students had already responded to this
proof at the statewide administration several months earlier. However, most of
the teachers in the multiple-choice proofs assessment reported that they had not
reviewed the proofs after the spring administration.



Table 6

Frequency Distribution of Focused-Holistic Scores
on the Common Proof and Multiple-Choice Scores
on the Same Proof Topic

Multiple-
Choice
Proof Scores    ------------ Focused-Holistic Proof Scores ------------

                0-1.0      1.5-2.0     2.5-3.0     3.5-4.0     Totals

0-1              2.8%       0.0%        0.0%        0.0%         1.0%

2-3             17.5%      10.0%        0.6%        1.0%         8.7%

4-5             56.3%      57.1%       39.2%       22.6%        45.0%

6               23.4%      32.9%       60.2%       76.5%        45.3%

Totals          36.5%      19.4%       20.7%       23.2%       100.0%

N=875
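
The layout of Table 6 can be read as follows, assuming (as the column sums of 100% suggest) that each focused-holistic column gives percentages within that score band and that the bottom row gives each band's share of the sample. The sketch below recombines the columns into the overall "Totals" column and recovers the share of the lowest band that answered 4 or more items correctly. The dictionary names are illustrative; the figures are those reported in the table.

```python
# Share of the 875 students in each focused-holistic score band (bottom row).
band_share = {"0-1.0": 36.5, "1.5-2.0": 19.4, "2.5-3.0": 20.7, "3.5-4.0": 23.2}

# Percentage of each band earning a given multiple-choice score (table body).
within_band = {
    "0-1": {"0-1.0": 2.8, "1.5-2.0": 0.0, "2.5-3.0": 0.0, "3.5-4.0": 0.0},
    "2-3": {"0-1.0": 17.5, "1.5-2.0": 10.0, "2.5-3.0": 0.6, "3.5-4.0": 1.0},
    "4-5": {"0-1.0": 56.3, "1.5-2.0": 57.1, "2.5-3.0": 39.2, "3.5-4.0": 22.6},
    "6": {"0-1.0": 23.4, "1.5-2.0": 32.9, "2.5-3.0": 60.2, "3.5-4.0": 76.5},
}


def overall_percent(mc_row: str) -> float:
    """Weight each band's percentage by that band's share of the sample."""
    return sum(within_band[mc_row][band] * share / 100.0
               for band, share in band_share.items())


for row in within_band:
    print(row, round(overall_percent(row), 1))

# Students scoring 0 through 1.0 on the written proof who got 4 to 6 items right.
low_band_high_mc = within_band["4-5"]["0-1.0"] + within_band["6"]["0-1.0"]
print(round(low_band_high_mc, 1))
```

Small differences between the recomputed values and the printed "Totals" column are to be expected, since the table entries are themselves rounded.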



Table 7 gives the correlations between the two multiple-choice tests and the 
two instructor ratings. The 32-item multiple-choice proofs test correlated .527 
with proofs grades, while the correlations between the performance-based proofs 
and proofs grades ranged between .574 and .698, depending on the type of proof, 
the number of readings, and the number of proofs (see Table 5). If differences in
reliability of the tests were taken into account, the difference in predictive validity, 
using proofs grade, would be even greater. 
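
One conventional way to take reliability into account is the classical correction for attenuation, which divides an observed correlation by the square root of the predictor's reliability. The sketch below is offered only as an illustration: the paper does not say which adjustment the authors had in mind, and it reports no reliability for the 32-item multiple-choice proofs test, so the value 0.85 is a labeled placeholder. The function name is hypothetical. The proof figures used are the .603 correlation of the common proof (two readings) with proofs grade and the .610 average single-proof reliability with the common proof read twice.

```python
import math


def correct_for_predictor_unreliability(r_xy: float, r_xx: float) -> float:
    """Observed validity r_xy divided by sqrt of the predictor reliability r_xx."""
    if not 0.0 < r_xx <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return r_xy / math.sqrt(r_xx)


proof_validity = 0.603      # common proof, two readings, vs. proofs grade
proof_reliability = 0.610   # average single-proof reliability, common proof read twice

mc_validity = 0.527         # 32-item multiple-choice proofs test vs. proofs grade
mc_reliability = 0.85       # ASSUMPTION: not reported in the paper, illustrative only

print(round(correct_for_predictor_unreliability(proof_validity, proof_reliability), 3))
print(round(correct_for_predictor_unreliability(mc_validity, mc_reliability), 3))
```

Because a single proof is less reliable than a multi-item test, dividing by the square root of a smaller reliability raises the proof's validity by more, which is the direction of the effect the paragraph above describes.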





Table 7

Predictive Validities of Non-Essay Assessments

Multiple-Choice     Course     Proofs
Test                Grade      Grade

Geometry            .639       NA

Proofs              .406       .527

Finally, Table 8 gives the multiple correlations (R) when the proofs
performance test is combined with the multiple-choice tests to predict grades.
Note that the proofs assessment adds between .02 and .05 to the predictive
validities.



Table 8

Predictive Validities Combining Common Proof Score
and Multiple-Choice Components

Multiple-Choice     Course     Proofs
Test*               Grade      Grade

Geometry            .659       NA

Proofs              .434       .577

*Test combined with focused-holistic proof score.



Feasibility of Statewide Proofs Performance Assessments

Analyses of the Testing Section of the NCDPI indicate that the specific cost
to the state of North Carolina for conducting the geometry proof field test was
approximately $3.00 per student. This cost includes a curriculum consultant,
materials, development, training, scoring, and report generation. Excluded are
costs for travel, some facilities expenses, and the salaries of staff of the NCDPI.
For SY 1990 the cost is estimated at approximately $2.44 per student.

The statewide field test demonstrated that student responses to proof 
problems can be scored and reported in a reliable and relatively cost-efficient
manner. The logistics developed for this aspect of the EOC testing program are
quite feasible, and could be generalizable to other statewide performance efforts. 

Discussion 

The results of this study indicate that scorer reliability in proof ratings was
high, as demonstrated both by the agreement rates and the correlational
reliability estimates for the common proof and the variable proofs which were
scored twice. This finding is of particular interest since the scoring involved over
400 raters reading proof papers distributed among eight different scoring sites.
Five of the scoring sites (1, 2, 4, 7, and 8) are largely rural, yet the reader
agreement rates for the rural sites were similar to those in the large urban
centers. The consistency of scoring across sites and different groups of readers is
testimony to the clarity of the scoring guide, the consistency of the scoring
criteria, and the willingness of teachers selected as scorers to accept the scoring
process. Actual time devoted to training was less than three hours, during which
time the scoring guide was reviewed, three sets of proofs were scored and
discussed, and qualifying rounds were held. The context effects were minimized,
resulting in more reliable scoring (Keeling and Baker, 1985).

Further analyses of the relationship between scores on the two proofs each
student took indicate that the addition of one extra item contributed dramatically
to the overall reliability of the proofs test, beyond that of scorer reliability alone.
Except for the addition of training time on the second proof, the cost of scoring one
proof twice is similar to the cost of scoring two proofs once, and scoring two proofs
once may be more cost-effective due to the increase in overall reliability. In this
assessment students could not receive scores based on two proofs because the
students took one of four variable proofs which differed in overall difficulty. The
primary purpose of the variable proofs was to provide broader curriculum coverage
by assessing five proofs in every school.

The relationship between the proof scores and other measures of proof
performance and geometry performance indicates that the performance-based
proofs measures are valid indicators of proofing ability. Furthermore, the
performance-based proof scores were more highly related to grades in proofing
skill than were the scores on a multiple-choice completion test format. This
finding lends support to the subjective impression of many educators that
actual performance tasks are more valid than multiple-choice tests related to those
tasks. Not only do these tasks have face validity, but a degree of predictive validity
as well.

In a time of performance- or outcome-based accountability systems and
measurement-driven instruction, the measurement of skills in a more authentic, 
performance-based, manner takes on additional meaning. Suhor (1985) reported 
that both a poll of 350 language arts supervisors and research data indicate that 
writing instruction decreases with objective tests, and increases where direct 
writing assessments are implemented. This has certainly been the experience in 
North Carolina. One purpose, therefore, of performance-based assessments like 
the proofs test is that they encourage certain types of instruction. This field-test 
demonstrated that alternative strategies for assessing student ability can be 
implemented in an objective and reliable fashion. The findings were that 
teachers can be trained to score proofs with a high degree of consistency. The 
approach used by North Carolina to measure student performance on a geometry
proof could be adapted for use in other subject and skill areas, and by any school 
system. 

(North Carolina Geometry Goals and Objectives)

NORTH CAROLINA END-OF-COURSE TESTING
GEOMETRY GOALS AND OBJECTIVES

Goal 1: The learner will state the characteristics of sets of points.

1. Identify and name sets of points, such as line, ray, segment and plane.
2. Draw representations of points, lines, and planes.
3. Identify and name unions and intersections of sets of points.
4. Find the coordinate of a point on a line.
5. Find the length of a segment.
6. Identify congruent segments.
7. Identify the midpoint of a given segment.
8. Use a protractor to find the measure of an angle.
9. Determine when two angles are congruent.
10. Identify interiors and exteriors of geometric figures.
11. Identify the bisector of an angle.

Goal 2: The learner will use the structural properties of the real number system.

1. State and use the properties of equality.
2. State and use the properties of inequality.

Goal 3: The learner will develop geometric proofs.

1. Translate a geometric statement into an "If-Then Statement".
2. State the converse of a conditional statement.
3. State the hypothesis and conclusion for a conditional statement.
4. Use the process of deductive reasoning in mathematical and non-mathematical
situations.
5. Write a proof using the two-column format.
6. Write an indirect proof.

Goal 4: The learner will use some of the properties of angles and lines to develop proofs
and solve exercises.

1. Use three letters, a number, or a single letter to name an angle.
2. Classify an angle.
3. Identify adjacent and vertical angles.
4. Determine the complement and supplement of a given angle.
5. Apply the Angle Addition Postulate.
6. Apply the Segment Addition Postulate. (Definition of Betweenness)
7. Recognize congruent angles.

Goal 5: The learner will recognize perpendicular lines and planes and use this
information to complete proofs and exercises.

1. Apply definitions of perpendicular lines and planes.







Goal 6: The learner will recognize parallel lines and planes and use this knowledge to
complete proofs and exercises.

1. Identify parallel lines and planes, and skew lines.
2. Identify corresponding angles and alternate interior angles which are formed when
two parallel lines are cut by a transversal.
3. State conditions under which lines are parallel.
4. State which angles are congruent when two parallel lines are cut by a transversal.
5. Identify which angles are supplementary when lines are cut by a transversal.

Goal 7: The learner will identify polygons and complete proofs and exercises related to
them.

1. Classify a triangle according to its sides.
2. Classify a triangle according to its angles.
3. Classify a polygon according to the number of its sides or angles.
4. Classify a convex polygon according to the measure of its angles.
5. Apply the fact that the sum of the measures of the angles of a triangle is 180.
6. Find the measures of the exterior angles of a triangle.
7. Find the measures of the interior and exterior angles of a convex polygon.
8. Apply the characteristics of various quadrilaterals.

Goal 8: The learner will identify congruent triangles and complete proofs and exercises
related to them.

1. List the corresponding parts of two congruent triangles.
2. Use various postulates and theorems to prove two triangles are congruent and their
corresponding parts congruent.
3. Identify the altitudes and medians of triangles.
4. Apply the theorem about the segment joining the midpoints of two sides of a
triangle.
5. Apply the theorem about the intersection of the medians of a triangle.

Goal 9: The learner will demonstrate when two polygons are similar and develop proofs
and solve exercises related to them.

1. Identify regular polygons and determine the measures of the angles.
2. Solve a proportion.
3. Use proportions to solve geometric problems.
4. Find the geometric mean of two numbers.
5. Determine whether or not two polygons are similar.
6. Prove two triangles are similar.
7. Apply properties of similar triangles to find corresponding proportional sides.
8. Apply theorems which involve dividing segments proportionally.

Goal 10: The learner will state some of the characteristics of a right triangle and solve
exercises related to them.

1. State two relationships that exist in a right triangle.
2. Use the Pythagorean Theorem and its converse to find the lengths of the sides of a
triangle or a quadrilateral.
3. Use the relationships that exist in special right triangles to solve problems.
4. Using a table and/or calculator, apply the definitions of sine, cosine, and tangent to
solve right triangles.

Goal 11: The learner will list some characteristics of a circle and develop proofs and
solve exercises related to them.

1. [illegible] the definitions of a circle and the lines and segments related to it.
2. Recognize polygons inscribed in or circumscribed about a circle.
3. Apply properties involving arcs and angles of circles.
4. Apply the theorems about the chords of a circle.
5. Apply the theorems that relate to the tangents, secants, and radii of a circle.

Goal 12: The learner will find the perimeter, area, and volume of geometric figures.

1. Find the perimeter of a geometric figure.
2. Compute the area of a triangle, parallelogram, trapezoid, and rectangle.
3. Find the ratio of both the areas and the perimeters of similar triangles.
4. Compute the apothem, radius, and area of special regular polygons.
5. Compute the circumference and area of a circle.
6. Compute arc lengths and the areas of sectors of a circle.
7. Identify and describe space figures.
8. Compute the lateral area, total area, and volume of a right prism or pyramid.
9. Compute the lateral area and volume of a right circular cylinder or cone.

Goal 13: The learner will complete a geometric construction and describe a locus of a
point or points.

1. Construct a segment congruent to a given segment.
2. Construct an angle congruent to a given angle.
3. Construct the bisector of an angle.
4. Construct a line perpendicular to a line through a point on the line.
5. Construct a line perpendicular to a line through a point not on the line.
6. Construct the perpendicular bisector of a segment.
7. Construct a line parallel to a line through a given point.
8. Construct the tangents to a circle from a point outside the circle.
*9. Circumscribe a circle about a triangle.
10. Inscribe a circle inside a triangle.
11. Divide a segment into a given number of congruent segments.
12. Given three segments, construct a fourth segment such that the lengths of the four
segments are proportional.
13. Construct a segment whose length is the geometric mean between the lengths of two
given segments.
14. Construct quadrilaterals which meet certain criteria.
15. Construct a circle through three non-collinear points.



*These objectives would be included in an enriched course but not in a basic course. 



Goal 14: The learner will investigate some of the properties of coordinate geometry.

1. Write the coordinates for a point in the coordinate plane.
2. Write equations for vertical and horizontal lines in the coordinate plane.
3. Use the distance formula to solve problems.
4. Use the midpoint formula to find the coordinates of the midpoint or endpoint of a
segment.
5. Find the slope of the line given two points on the line.
6. Find the slope and y-intercept of a line.
7. Write an equation for a line which is parallel or perpendicular to a given line.
8. Write the equation and draw the graph of a line when given either two points on the
line, one point and the slope of the line, or the slope and y-intercept of the line.
9. Use coordinate geometry to prove some of the properties of polygons.
10. Write an equation of a circle given its center and radius length.
11. Find the center and radius length of a circle given an equation.



Appendix B 

(Rated Samples of Geometry Proofs
and the Score Scale)

The Score Scale

Examples of the score points are given on the following pages. Note
that although proofs may differ in difficulty and complexity, the criteria for each
score point should remain the same. Differences in difficulty will then be evident
in the proportions of students receiving each score point. In addition, the score
scale is not meant to be interval in nature; the difference between a 1 and a 2 will
not be the same as the difference between a 2 and a 3, etc. Just as there are many
varieties of 'B' students, there will be relatively wide variations in quality within
score points and these will occur in some score points more than others.

4 = The response demonstrates a clear understanding of the proof. The proof is complete. All
logical steps are given and wording is accurate. All statements are logically sequenced and
all reasons are correctly aligned with these statements. Mathematically equivalent variations
of the answers given in this guide are given a score of 4. Complete and geometrically
correct proofs arrived at by different methods than those presented here are also given a
score of 4, as long as the logic is sound. Unconventional wording and abbreviations, minor
misspellings, and correct, but irrelevant, statements that do not seriously detract from the
solution as a whole are allowed as long as the statements and reasons are mathematically
correct. Proofs scored a 4 do not contain any incorrect statements or reasons, even if the
incorrect information is irrelevant to the proof.

3 = The response exhibits a reasonable command of geometric logic in developing the proof.
The proof indicates considerable thought and sound logic in the sequence of statements and
reasons, but may be lacking in precise notation, wording of theorems, postulates, etc. The
proof is generally coherent and complete overall, with major steps always present, although
minor weaknesses are present, i.e. a part of a step such as a reason may be missing or stated
incorrectly if the corresponding statement is present and correct, or incorrect irrelevant
statements may be present.

2 = The response demonstrates a weakness in geometric logic in developing the proof. A proof
is attempted but is not complete in logic or sequence of statements and reasons. In some
proofs, although the student demonstrates a fair understanding of the problem, he or she has
omitted or incorrectly stated a major step(s) (including the given) required of the proof.
Statements and reasons following an incorrect step may be logical and geometrically sound
but they follow from a false conclusion. In other responses the sequence of logical steps is
not maintained to the extent that it detracts from the solution.

1 = The response exhibits a lack of command of geometry in developing the proof. There is
evidence that the student has seen the problem and has attempted the proof, but the proof is
off-base. The student demonstrates a vague knowledge of the steps in the proof, but there
is very little substance to the proof. The first and last steps may be present, however the
majority of the intervening statements and reasons are incorrect or irrelevant. The proof
must contain some bit of relevant and correct information other than the given and the prove.

0 = Either the proof is not attempted, the paper is blank, or only the given and/or prove steps are
present, or all other steps are totally (statement and reason) incorrect or irrelevant. Nothing
is correct except the given and/or prove steps.

Use the figure to prove the following exercise.

1. Given: Figure ABCD is a rectangle
with diagonals BD and AC.

Prove: △AED ≅ △CEB

[Figure of rectangle ABCD and the student's handwritten two-column proof (Statements / Reasons); not legible in this copy]

Score Point 0.

All steps other than the given and prove are totally incorrect or irrelevant.

Use the figure to prove the following exercise.

1. Given: Figure ABCD is a rectangle
with diagonals BD and AC.

Prove: △AED ≅ △CEB

[Figure of rectangle ABCD and the student's handwritten two-column proof (Statements / Reasons); not legible in this copy]

Score Point 1.

The response exhibits a lack of command of geometry in developing the proof.
The student demonstrates a vague knowledge of how to prove that two triangles
are congruent, but there is very little substance to the proof. The only bits of
relevant and correct information in this proof, other than the given and prove,
are the first part of Statement 2, Statement 3, and Statement 6 (too late though!).



Use the figure to prove the following exercise.

1. Given: Figure ABCD is a rectangle
with diagonals BD and AC.

Prove: △AED ≅ △CEB

[Figure of rectangle ABCD and the student's handwritten two-column proof (Statements / Reasons); not legible in this copy]

Score Point 2.

The response demonstrates a weakness in geometric logic in developing the
proof. Step 4 and Reason 5 are incorrect. Steps 2, 3, and 6 are irrelevant. Also,
Statement 3 and Reason 3 do not "agree"; and "BC // AD" must precede Statement
5. The sequence of logical steps is not maintained to the extent that it detracts
from the solution.

Use the figure to prove the following exercise.

1. Given: Figure ABCD is a rectangle
with diagonals BD and AC.

Prove: △AED ≅ △CEB

[Figure of rectangle ABCD and the student's handwritten two-column proof (Statements / Reasons); not legible in this copy]

Score Point 3.

The student demonstrates a reasonable understanding of how to do this proof
using the SSS method, although a minor weakness is present. The student failed to
state that the diagonals of a rectangle bisect each other. This is needed prior to
Step 3. This omission is a minor weakness since Step 3 is present. Although a
minor weakness is present, the proof indicates considerable thought and sound
logic in the sequence of statements and reasons.

1. Given: Figure ABCD is a rectangle
with diagonals BD and AC.

Prove: △AED ≅ △CEB

[Student's handwritten two-column proof in five steps, concluding by SSS; not legible in this copy]

Score Point 4.

The student demonstrates a clear understanding of how to prove that two
triangles are congruent using the SSS method by including all steps in the proof. All
statements are logically sequenced and all reasons are correctly aligned with
these statements. The abbreviations used are acceptable. Although there are
minor misspellings in Statement 1, Reason 2, and Reason 5, they do not seriously
detract from the solution as a whole. The proof is accurate and complete.
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VARIABLE PROOF A: PERPENDICULAR BISECTOR 



02. Given: In AABC, is the pen^endicular 
bisector of AC. 

Prove: AB bBC 

1. ^ ^ Age, -35k 1: 

Jiuiit-^:::fe(y\ AC. 
^- / EDA and Z.^!)C (^re, d Is 




I. G«oen 



5 5jA5 

t cPCTC 



Score Point 4 

The student demonstrates a clear understanding of how to prove that two segments are congruent
by including all steps in the proof. All statements are logically sequenced, and all reasons are
correctly aligned with these statements. The abbreviations used are acceptable. The proof is
accurate and complete.







VARIABLE PROOF B: THREE DIMENSIONAL 



D2. Given: BD ⊥ plane P at D and [illegible]

Prove: AD ≅ DC




Score Point 4 

The student demonstrates a clear understanding of how to prove that two segments are congruent
by including all steps in the proof. All statements are logically sequenced, and all reasons are
correctly aligned with these statements. The abbreviations used are acceptable. The proof is
accurate and complete.
VARIABLE PROOF C: PARALLEL LINES

G2. Given: ST ≅ WU
           ST ∥ WU

Prove: [illegible]

[Handwritten student response; not legible in this copy.]




Score Point 4 

The student demonstrates a clear understanding of how to develop the proof by including all steps.
All statements are correct and logically sequenced, and all reasons justify these statements. The
abbreviations used are acceptable. The proof is accurate and complete.



VARIABLE PROOF D: SIMILAR TRIANGLES 



[Handwritten student response in statement/reason form; not legible in this copy.]



Score Point 4 

The student demonstrates a clear understanding of how to prove that distances are proportional by
including all steps in the proof. All statements are logically sequenced, and all reasons are
correctly aligned with these statements. The abbreviations used are acceptable. The proof is
accurate and complete.



Appendix C 

(Percentage of Students Receiving 
Each Score on the 1988-89 Geometry 
Proof Field-test) 
1988-89 Geometry Proof Field Test
Quadrilateral (Common Proof)

            0.0   0.5   1.0   1.5   2.0   2.5   3.0   3.5   4.0        N
State       8.4   5.5  19.2   9.4  11.1   8.6  12.9   7.2  17.6   43,926
Region 1    6.0   5.3  18.0  10.1  12.9   9.6  13.4   6.6  18.1     2280
Region 2    8.1   4.6  20.4  10.7  12.9   8.6  11.1   6.7  17.0     5090
Region 3    7.1   4.6  17.8   9.0  10.9   8.4  13.7   7.6  20.7     7286
Region 4   11.3   8.4  22.6  10.6  10.8   8.1  10.1   5.8  12.3     5147
Region 5    8.1   5.0  18.1   8.6  11.3  10.1  13.8   8.0  17.1     8256
Region 6   10.5   6.0  21.7   9.3   9.8   6.7  13.1   6.7  16.2     7942
Region 7    6.2   4.7  16.9   9.2  10.6   9.4  14.1   8.3  20.6     4248
Region 8    7.3   5.2  16.0   8.7  11.8   8.7  14.0   7.6  20.6     3676
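
As a consistency check on the common proof table, the state percentage at a given score point should be close to the average of the regional percentages weighted by the number of students in each region. The sketch below does this for score point 4.0 only, using the regional N and percentages printed in the table; the function name `pooled_percentage` is illustrative and not part of the report.

```python
# (N, percent at score 4.0) for Regions 1-8, copied from the common proof table
regions_score_4 = [
    (2280, 18.1), (5090, 17.0), (7286, 20.7), (5147, 12.3),
    (8256, 17.1), (7942, 16.2), (4248, 20.6), (3676, 20.6),
]

def pooled_percentage(rows):
    """Weight each regional percentage by its N and divide by the total N."""
    total_n = sum(n for n, _ in rows)
    return sum(n * pct for n, pct in rows) / total_n

state_estimate = pooled_percentage(regions_score_4)
print(round(state_estimate, 1))  # compare with the State row (17.6)
```

The regional N values sum to 43,925, one short of the state N of 43,926, so small differences from the published state figures are expected.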



Perpendicular Bisector (A)

            0.0   1.0   2.0   3.0   4.0        N
State       6.7  25.7  25.5  20.3  21.7   11,177
Region 1    2.9  24.1  30.0  15.5  27.7      582
Region 2    5.1  26.7  28.?  20.4  19.3     1294
Region 3    6.8  20.4  25.7  19.6  27.5     1846
Region 4    8.1  31.5  25.0  14.8  20.5     1310
Region 5    5.8  24.2  24.2  22.3  23.5     2098
Region 6    9.0  29.0  25.7  20.6  15.7     2027
Region 7    4.2  22.0  25.3  26.2  22.3     1083
Region 8    9.2  28.6  21.8  20.5  19.9      936
Three Dimensional (B)
Percentage of Students Receiving Each Score

            0.0   1.0   2.0   3.0   4.0        N
State      15.5  43.4  32.4   7.7   1.0    11017
Region 1   12.6  48.0  30.4   8.2   0.9      573
Region 2   19.2  51.4  23.0   5.8   0.6     1279
Region 3   13.8  33.8  41.6   9.6   1.1     1827
Region 4   21.4  53.8  21.5   ?.1   ?.1     1287
Region 5   13.3  42.8  32.5   9.8   1.5     2066
Region 6   15.9  43.4  32.6   6.9   1.2     1999
Region 7   13.2  40.7  36.3   9.2   0.7     1057
Region 8   14.3  38.2  38.8   7.6   1.1      929








Parallel Lines (C)

            0.0   1.0   2.0   3.0   4.0        N
State      28.6  29.7  12.7   9.4  19.7    10925
Region 1   29.1  29.5  14.3   9.7  17.5      567
Region 2   28.3  32.7  11.4   9.5  18.1     1263
Region 3   29.1  27.1  11.7   9.9  22.2     1814
Region 4   35.3  35.5  10.4   5.9  12.9     1284
Region 5   26.2  28.4  15.2  10.4  19.7     2061
Region 6   31.9  28.4  11.0   8.5  20.3     1970
Region 7   19.0  29.8  14.6  12.2  24.4     1059
Region 8   28.2  27.8  14.1   9.3  20.6      907
Similar Triangles (D)
Percentage of Students Receiving Each Score

            0.0   1.0   2.0   3.0   4.0        N
State      17.3  37.5  21.6  15.6   8.1   10,807
Region 1   13.6  36.4  23.8  16.7   9.5      558
Region 2   14.8  43.1  20.0  16.3   5.9     1254
Region 3   14.9  36.1  19.7  19.4  10.0     1799
Region 4   22.1  45.3  19.0   9.2   4.4     1266
Region 5   17.2  31.9  24.3  16.7   9.9     2031
Region 6   20.2  39.1  21.0  12.3   7.5     1946
Region 7   17.5  31.2  25.5  16.9   9.0     1049
Region 8   14.7  38.7  20.4  18.1   8.1      904
Appendix D

(Descriptive Statistics for Proof Scores 
and Grades and Geometry Proof Focused-Holistic 
Score Scale Distribution) 
Descriptive Statistics for Proof Scores and Grades

                             Number of            Standard
Variable                     Students     Mean    Deviation   Minimum   Maximum

Common Proof Rating 1          43,926    2.143      1.327                  4.0
Common Proof Rating 2          43,926    2.152      1.331                  4.0
Variable Proof Rating 1        43,926    1.705      1.253                  4.0
Estimated Geometry Grade       43,400    2.058      1.?47                  4.0
Estimated Proof Grade          ?3,103    1.848      1.300                  4.0
Geometry Proof Focused-Holistic Score Scale Distribution

                0.0    0.5    1.0    1.5    2.0    2.5    3.0    3.5    4.0

Common          8.4%   5.5%  19.2%   9.4%  11.1%   8.6%  12.9%   7.2%  17.6%
(N=43,926)

A               6.7%         25.7%         25.5%         20.3%         21.7%
(N=11,177)

B              15.5%         43.4%         32.4%          7.7%          1.0%
(N=11,017)

C              28.6%         29.7%         12.7%          9.4%         19.7%
(N=10,925)

D              17.3%         37.5%         21.6%         15.6%          8.1%
(N=10,807)
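
The common proof distribution above can be cross-checked against the means reported in the descriptive statistics (2.143 and 2.152 for the two ratings). The following is a minimal sketch, assuming the published percentages are the only input; because they are rounded to one decimal and sum to 99.9, the result is an approximation rather than the report's own calculation. The function name is illustrative.

```python
# Score points and statewide percentages for the common proof (N = 43,926)
scores = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
percents = [8.4, 5.5, 19.2, 9.4, 11.1, 8.6, 12.9, 7.2, 17.6]

def mean_from_distribution(score_points, pct):
    """Weighted mean of the score points, normalizing by the sum of the percentages."""
    total = sum(pct)  # about 99.9 here because of rounding in the table
    return sum(s * p for s, p in zip(score_points, pct)) / total

common_mean = mean_from_distribution(scores, percents)
print(round(common_mean, 2))  # close to the two rating means in the table above
```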
Appendix E

(A Summary of Teacher Assigned Proof Grade 
and Geometry Proof Scores: Common and 
Variable Proofs) 
TEACHER ASSIGNED PROOF GRADES AND
GEOMETRY PROOF SCORES: CORE PROOF

                         Proof Grades

Proof         F       D       C       B       A      All
Scores        %       %       %       %       %       %

0.0         23.5    10.2     4.5     1.4      .5     8.3
0.5         12.9     8.1     3.6     1.3      .4     5.5
1.0         33.1    28.6    18.1     7.3     2.7    19.2
1.5         10.7    13.8    11.1     6.3     2.1     9.5
2.0          9.0    13.0    14.4    10.5     5.9    11.1
2.5          4.5     8.8    11.2    10.3     6.7     8.6
3.0          3.9     9.7    16.1    19.0    16.6    13.0
3.5          1.0     3.1     6.8    12.6    16.2     7.2
4.0          1.4     4.7    14.4    31.5    49.0    17.6

chi-square = 17927.79, p < .001
r = .60
Variable Proof A
Perpendicular Bisector

Proof Grades

%^ 5 S ^ A All 

s." ! i I I Jj '! .S 

3 5 3 134 o\i ??n ^2.4 25.4 

4 17 31.0 29.3 20.2 

7.3 19.0 38.2 54.8 21 8 



Proof Grades 



Variable Proof B
Three-Dimensional 



0 3J21 i| I i I ^£ 

Scores 2 ^9'? ^^.7 49.0 32.9 16.8 43.1 

3 2 ^7-2 32.3 

4 .0 ^0 ? 2^-3 37.7 

-3 1.1 5.0 1.0 



Proof Grades 



Variable Proof C
Parallel Lines 



J ^ C tj A All 

?oof 1 «-3 33.7 ,11 M 



3 fi 'H !;i 7.9 12.6 

4 9 I? !6.9 9.3 



B 


A 


% 


% 


9.6 


3.1 


18.6 


. 7.2 


18.1 


7.9 


16.7 


16.9 


37.0 


64.9 



19.7 



Variable Proof D
Similar Triangles

Proof Grades

/ ^ C B A All 

? .1^6 ?3 ?0 ,7% 

Proof ^ 53.4 413 2^ ^i? 

Scores I II 27.4 31.1 23.2 l[| 

4 26-0 38.4 15.6 

•'•9 14.2 28.2 8.0 
Appendix F

(Sample School System Disaggregate Report) 
REGION
SYSTEM



NORTH CAROLINA END-OF-COURSE TESTING PROGRAM

GEOMETRY PROOFS 1989

SYSTEM REPORT



PAGE 1


VARIABLE PROOFS



NUMBER 
TESTED 



PERPENDICULAR 
BISECTOR



THREE 
DIMENSIONAL 



SCORE POINTS 



ALL STUDENTS TESTED 

STATE 

REGION

SYSTEM 



0 12 3 4 0 1 



43926 
7279 
851 



7 26 26 20 22 
7 20 26 20 27 
6 16 28 29 21 



16 43 32 
14 34 42 

9 30 45 



8 
10 
13 



PARALLEL 
LINES 



29 30 13 9 20 
29 27 ' 10 22 
23 29 . 16 19 



SIMILAR 
TRIANGLES



17 37 22 16 
IS 36 20 19 
11 44 ?G 15 



COMMON PROOF



SEX 
MALE



FEMALE 



STATE 

REGION 

SYSTEM 

STATE 

REGION 

SYSTEM 



19291 
3242 
385 

22799 
3761 
433 



24 25 21 22 
20 24 21 29 
12 29 32 24 

26 26 20 22 
20 28 19 27 
19 26 30 18 



15 40 34 9 

13 32 43 11 
B 29 50 11 

14 45 3! 8 
13 36 40 10 
10 31 42 16 



26 27 14 11 22 
24 23 13 13 27 
30 22 9 20 19 



29 32 13 

30 29 12 
18 33 IB 13 



9 18 
8 20 
18 



17 ?6 22 16 
14 34 20 22 
12 39 29 17 

16 38 22 16 
lA 38 19 18 11 
10 4U 22 14 



A 


n 


n n 












8 


8 


5 


10 


9 


11 


9 


13 


7 


18 


10 


7 


5 


18 


9 


11 


8 




8 


21 


5 


6 


4 


17 


Q 


12 


6 


17 


9 


20 


9 


8 


5 


17 


9 




9 


13 


8 


19 


9 


6 


4 


16 


B 


11 


9 


14 


Q 


£j 


3 


, 4 


4 


17 


6 


14 


g 


16 


q 


0'> 


9 


8 


5 


20 


10 


11 


B 


13 


7 


17 


11 


•> 

r 


4 


19 


10 


11 


g 


14 


7 


<\j 


7 


G 


3 


18 


11 


11 


9 


16 


3 


1 fl 


7 


12 


4 








7 


13 


6 


12 


8 


13 


0 


23 


20 


13 


5 


10 


3 


15 


0 


0 


0 


0 


0 


0 


• a 


n 
U 


U 


0 


4 


11 


8 


24 


11 


12 


7 


10 


5 


11 


2 


13 


8 


27 


13 


11 


5 


9 


5 


9 


0 


5 


5 


2"* 


15 


15 


S 


15 


5 


10 


5 


10 


6 


21 


11 


12 


8 


12 


6 


14 


4 


9 


5 


71 


11 


12 


9 


13 


6 


14 


U 


/ 


5 


19 


10 


13 


8 


15 


9 


i4 


0 


7 


5 


18 


9 


11 


9 


14 


8 


20 


3 


5 


4 


16 


8 


11 


9 


15 


8 


25 


4 


5 


3 


17 


e 


12 


8 


17 


G 


22 



PARENTAL EDUCATION 
LESS THAN 8tH STATE 
REGION 



249 
40 

SYSTEM 1 



9 25 33 
H 29 57 



7 25 
0 0 



8TH TO 12TH 



HIGH SCHOOL



STATE 2466 

REGION 386 

SYSTEM 20 

STATE 9953 

REGION 1514 

SYSTEM 133



MORE THAN 12TH STATE 29188 
REGION 5013
SYSTEM 665 



8 35 26 17 14 
11 37 28 12 13 
14 29 14 29 14 

7 29 28 18 18 

8 25 30 18 20 
7 17 29 33 14 

6 23 25 22 24 
4 17 25 22 32 
4 15 28 31 23 



27 43 28 
33 44 72 



19 50 26 
21 38 36 
0 0 0 

17 47 29 
15 39 38 
21 34 38 



28 46 
40 30 



10 



13 
10 



14 40 34 10 
11 32 43 17 
7 30 47 14 



0 
0 
0 

1 

0 
0 

2 
2 
2 



37 35 
42 35 
57 79 

31 34 
36 32 

38 35 



11 
11 
14 

13 
12 
12 



11 
0 
0 

14 

12 
0 



20 47 10 
8 67 0 
0 0 

23 45 19 
29 48 15 
0 80 20 

19 42 21 
18 44 17 
8 58 IV 



15 
17 
0 

9 
5 
0 

13 
17 
8 



25 28 14 ir73 
2.1 24 13 12 78 
19 27 15 18 21 



15 35 23 17 
12 37 71 72 
17 39 7/ 18 



NOTE: THE GEOMETRY PROOFS TESTS WERE ADMINISTERED IN EACH CLASSROOM. EACH STUDENT
TOOK THE COMMON PROOF AND ONE OF FOUR VARIABLE PROOFS. THE NUMBERS IN THE TABLE REPRESENT
THE PERCENTAGE OF STUDENTS ATTAINING EACH SCORE POINT. 100% IS REPRESENTED [ILLEGIBLE]. [ILLEGIBLE] FOR
ALL STUDENTS TESTED WERE OBTAINED DIRECTLY FROM THE SCORE DATA. PERCENTAGES BY SUBGROUP
WERE OBTAINED FROM DATA CODED ON THE MULTIPLE-CHOICE ANSWER SHEET.



REGION 
SYSTEM 



NORTH CAROLINA END-OF-COURSE TESTING PROGRAM

GEOMETRY PROOFS 1989 

SYSTEM REPORT 



PAGE 2



NUMBER
TESTED 



PERPENDICULAR 
BISECTOR 



VARIABLE PROOFS 

THREE • PARALLEL SIMILAR 

DIMENSIONAL LINES TRIANGLES



SCORE POINTS 



0 12 3 4 0 12 3 4 



0 1 2 3 4 0 1 2 3 4 



GRADE IN SCHOOL



COMMON PROOF 
0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0



NINE 


STATE 

REGION 

SYSTEM 


7820 
1532 
157 


2 8 18 29 43 
2 6 15 23 49 
0 3 23 38 38 


TEN 


STATE 

REGION 

SYSTEM 


19998 
3186 
400 


6 23 27 21 22 
4 19 27 21 28 
6 H 24 35 21 


ELEVEN 


STATE 

REGION 

SYSTEM 


11103 
17S7 
21C 


10 37 28 14 11 

11 30 32 13 14 
7 27 39 16 11 


TWELVE 


STATE 

REGION 

SYSTEM 


3162 
510 
45 


12 37 29 14 7 
V 34,39 11 9 
0 36 45 18 0 


OTHER 


STATE 
REGION


109 
47 
7 


6 16 13 19 45 
9 0 0 18 73 
0 0 0 «• 0 



4 


27 


45 




c 


17 


49 


26 ? 


3 


11 


47 


3/ 3 


14 


45 


33 


7 1 


11 


35 


44 


8 2 


8 


32 


40 


11 0 


22 


50 


24 


3 0 


17 


47 


32 


4 0 


7 


42 


44 


V 0 


24 


48 


26 


3 0 


23 


35 


AO 


2 0 


38 


38 


25 


0 0 


0 


15 


37 


26 22 


0 


7 


2'J 


43 ?l 


0 


U 


50 


U 50 



10 15 13 17 45 
10 13 11 18 48 
2 7 18 34 39 

25 31 15 10 19 
24 29 15 10 22 
10 33 17 14 19 



40 36 11 6 
44 33 11 6 
39 32 12 10 



41 38 11 
44 34 10 
50 50 0 



4 


18 


26 


32 


21 


2 


1 


7 


5 


19 


22 


34 


20 


2 


1 


7 


0 


17 


34 


31 


17 


1 


1 


3 


14 


30 


25 


15 


8 


6 


5 


18 


12 


35 


23 


19 


11 


5 


4 


16 


8 


39 


33 


16 


4 


4 


4 


14 


25 


48 


16 


8 


2 


13 


8 


26 


23 


49 


15 


11 


2 


11 


7 


27 


21 


60 


10 


9 


0 


10 


3 


31 


33 


44 


13 


8 


2 


16 


9 


26 


23 


50 


10 


13 


3 


14 


7 


26 


0 


83 


17 


0 


0 


13 


13 


31 



5 


9 


8 


16 


13 


38 


5 


8 


7 


16 


13 


41 


5 


8 


4 


20 


14 


45 


10 


12 


9 


14 


8 


18 


9 


12 


9 


16 


8 


21 


8 


13 


11 


18 


9 


19 


11 


12 


8 


10 


4 


8 


12 


12 


e 


11 


4 


8 


11 


13 


7 


13 


5 


7 


11 


11 


7 


9 


4 


7 


10 


13 


-? 


10 


4 


8 


9 


18 


7 


7 


2 


0 



n 

0 



5 5 11 58 
0 11 11 CI 
0 0 0 



13 27 13 17 30 
9 9 9 IH 55 
0 0 0 0 



10 7 6 
9 2 0 
14 0 0 



6 
2 
0 



7 6 15 40 
9 4 15 60 
29 14 14 29 



ETHNIC GROUP 
AMER. INDIAN 



BLACK 



WHITE 



OTHER 



STATE 


436 


11 


31 


37 14 


6 


16 


67 


16 


1 


0 


REGION 


25 


40 


0 


40 20 


0 


9 


64 


27 


0 


0 


SYSTEM


0 
















STATE 


10089 


10 


36 


27 13 


13 


23 


49 


23 


4 


0 


REGION 


2134 


11 


31 


30 14 


14 


23 


41 


30 


5 


0 


SYSTEM 


206 


9 


26 


29 19 


1/ 


16 


40 


32 


12 


0 


STATE 


30681 


5 


21 


25 23 


25 


12 


40 


36 


10 


2 


REGION 


4669 


3 


15 


25 23 


34 


8 


30 


47 


13 


2 


SYSTEM 


582 


4 


11 


20 30 


22 


/ 


20 


49 


14 


2 


STATE 


854 


6 


17 


24 20 


34 


10 


38 


32 


17 


w 

4 


REGION 


164 


3 


8 


22 11 


57 


0 


37 


37 


21 


5 


SYSTEM 


29 


0 


20 


20 40 


20 


0 


0 


75 


25 


0 



36 39 9 8 8 24 51 17 



42 37 
44 33 

37 35 

23 ?8 
20 M 
19 25 

14 25 
3 14 

0 27 



9 
10 
17 

15 
14 

14 

11 
11 

9 



11 23 
13 29 

18 23 

15 34 

19 53 
27 36 



27 44 17 
26 47 14 
1/ 50 24 

13 35 24 
9 32 22 
9 42 25 

15 27 24 
13 21 2! 
13 25 30 



8 
10 

7 



11 9 24 12 14 
20 16 16 8 12 



9 8 
8 12 



10 
24 
19 

24 
3C 
13 



10 
13 



10 
10 
13 



14 


8 


27 


11 


11 


8 


9 


4 


8 


12 


7 


26 


12 


11 


9 


10 


4 


6 


9 


8 


24 


13 


14 


8 


10 


5 


10 


6 


4 


16 


9 


11 


9 


14 


8 


21 


4 


3 


14 


7 


11 


9 


16 


9 


27 


5 


2 


15 


7 


12 


9 


19 


10 


23 


7 


4 


13 


7 


9 


8 


15 


11 


25 


4 


4 


y 


7 


IC 


5 


18 


12 


33 


3 


0 


17 


7 


10 


3 


't 
t 


7 


34 



NOTE: [ILLEGIBLE] TESTS WERE ADMINISTERED IN EACH CLASSROOM. EACH STUDENT [TEXT CUT OFF]
N C END-OF-COURSE TESTING PROGRAM: 1988-89
CLASS ROSTER FOR GEOMETRY PROOFS

REGION    6
SCHOOL    ANSON JUNIOR H.S.
CODE      40304
SYSTEM    ANSON COUNTY SCHOOLS

TEACHER   RANDALL P

CLASS PERIOD 1


NOTE: CODE THE PROOF SCORES ON THE APPROPRIATE STUDENT ANSWER SHEETS ACCORDING
[ILLEGIBLE] 14 OF THE TEST ADMINISTRATOR'S [ILLEGIBLE] FOR
THE GEOMETRY TEST. CODE THE COMMON PROOF SCORE IN COLUMNS K AND L, THE
VARIABLE PROOF SCORE IN COLUMNS M AND N, AND THE FORM IN COLUMN O. ONLY THE COMMON
PROOF [ILLEGIBLE] USED IN DETERMINING STUDENT GRADES. THE VARIABLE PROOFS
VARIED IN DIFFICULTY AND WILL BE USED FOR SCHOOL AND SCHOOL SYSTEM REPORTING.
FOR YOUR INFORMATION, THE PROOFS FOR EACH FORM WERE AS FOLLOWS:
A=PERPENDICULAR BISECTOR, B=THREE DIMENSIONAL, C=PARALLEL LINES, D=SIMILAR TRIANGLES.
THE STATEWIDE SCORE DISTRIBUTIONS FOR ALL PROOFS ARE [ILLEGIBLE].
[ILLEGIBLE] PROOF [ILLEGIBLE] BASED ON TWO INDEPENDENT READINGS WHICH PRODUCE SOME
MID-POINT SCORES. VARIABLE PROOF SCORES ARE BASED ON ONE READING.



STATEWIDE DISTRIBUTION OF SCORES

PROOF      0.0    0.5    1.0    1.5    2.0    2.5    3.0    3.5    4.0    TOTAL

COMMON     8.4%   5.5%  19.2%   9.4%  11.1%   8.6%  12.9%   7.2%  17.6%   43926
A          6.7%         25.7%         25.5%         20.3%         21.7%   11177
B         15.5%         43.4%         32.4%          7.7%          1.0%   11017
C         28.6%         29.7%         12.7%          9.4%         19.7%   10925
D         17.3%         37.5%         21.6%         15.6%          8.1%   10807
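
The roster note says the common proof score rests on two independent readings that "produce some mid-point scores", which is why the common proof row has entries at 0.5, 1.5, 2.5, and 3.5 while the single-reading variable proofs do not. A minimal sketch of one way this could work follows, assuming the reported score is the simple average of two integer ratings on the 0-4 scale; the report does not state the combining rule, and the function names and the "discrepant" label are hypothetical.

```python
VALID_RATINGS = (0, 1, 2, 3, 4)

def combine_readings(first, second):
    """Average two integer ratings; ratings one point apart yield a mid-point score."""
    for rating in (first, second):
        if rating not in VALID_RATINGS:
            raise ValueError(f"rating out of range: {rating}")
    return (first + second) / 2

def agreement(first, second):
    """Classify a pair of ratings as perfect, adjacent, or (hypothetically) discrepant."""
    difference = abs(first - second)
    if difference == 0:
        return "perfect"
    if difference == 1:
        return "adjacent"
    return "discrepant"

print(combine_readings(3, 4), agreement(3, 4))
print(combine_readings(2, 2), agreement(2, 2))
```

Under this assumption, every mid-point score corresponds to an adjacent (one-point) disagreement between the two readers.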



STUDENT     COMMON PROOF     FORM     VARIABLE PROOF


ERIC 



2.5 

1.0 

4.0 

3.5 

1.5 

1.0 

1.0 

3.5 

4.0 

2.0 

3.0 

1.5 

0.0 

2.0 

2.0 



A 
A 
D 
B 
A 
B 
C 
C 
D 
A 
0 
C 
B 
D 



3.0 
3.0 
2.0 
3.0 
3.0 
2.0 
4.0 
4.0 
3.0 
2.0 
3.0 
2.0 
2.0 
4.0 
1.0 
Appendix G
(Teacher Survey Data) 
Responses to Evaluation of Geometry Field Test

Question 1: Test Administration

Question 1a: Was there sufficient time to do two proofs in one period?

Responses to 1a:

[Pie chart of responses to 1a. Legible labels: Unclassifiable response 0.7%; Yes - with conditions; No 0.3%; 98.0%.]

Response                    Count
No                              1
Yes - with conditions           3
Yes                           297
Unclassifiable response         2
Missing                        12

Summary of 1a:

[Illegible] response was focused more directly at a [illegible]. There was [illegible] response [illegible] with administrative duties.
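
The percentages on the survey charts appear to be computed over the teachers who answered each question, with "Missing" left out of the base. The sketch below illustrates this for question 1a, using the counts tabulated above; the function name is illustrative, and the exclusion of missing responses is inferred from the figures rather than stated in the report.

```python
def response_percentages(counts, missing_label="Missing"):
    """Percentage of each response among non-missing responses, to one decimal."""
    answered = {label: n for label, n in counts.items() if label != missing_label}
    total = sum(answered.values())
    return {label: round(100 * n / total, 1) for label, n in answered.items()}

counts_1a = {
    "No": 1,
    "Yes - with conditions": 3,
    "Yes": 297,
    "Unclassifiable response": 2,
    "Missing": 12,
}

for label, pct in response_percentages(counts_1a).items():
    print(f"{label}: {pct}%")
```

With 303 teachers answering, 297 "Yes" responses give the 98.0% shown on the chart for 1a.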
Question 1b: Should all testing be done first period?

Responses to 1b:

[Pie chart of responses to 1b. Legible labels: Don't know 2.1%; Unclassifiable response 4.2%; 39.4%.]

Response                    Number
Don't know                       6
Unclassifiable response         12
Other                           52
Yes                            113
Yes - with conditions           29
No - with conditions            22
No                              53
Missing                         28

Summary of 1b:

[Illegible] any other response. However, the [illegible] response showed a [illegible] the same period for all students over the state [illegible] day already knew which [illegible].
Question 1c: Is the end of March an appropriate time?

Responses to 1c:

[Pie chart of responses to 1c. Legible labels: 60.0%; 4.9%; 27.7%.]

Response                    Number
No                              79
No - with conditions            14
Yes - with conditions           21
Yes                            171
Missing                         30

Summary of 1c:

[Illegible] responses as any other answer. In general, the affirmative responses do not have [illegible]. Negative responses, on the other hand, usually are followed by reasons; in this case, the no responses were reported because the test was given later than the teachers thought it should have been. Many teachers explained that they taught proofs [illegible]. [Illegible] conditional response was due to conflicts with spring break and other End-of-Course testing.
Question 1d: Did students reveal their true abilities?

Responses to 1d:

[Pie chart of responses to 1d. Legible label: 42.7%.]

Response                    Number
No                              26
No - with conditions            21
Maybe                            9
Yes - with conditions           49
Yes                            109
Don't Know                      41
Missing                         60

Summary of 1d:

[Illegible] response. The responses with phrases such as "most did" and "probably did" were categorized as yes-with conditions. Also in this category are the responses with "background interference" such as nervousness. Many teachers wrote that they would not know the answer to [illegible] students' apathy, nervousness, and, from the response to question 1b, the apparent cheating.
Question 2: Proofing Skills

Question 2a: Is it important to test "proofing" in addition to multiple choice testing?

Responses to 2a:

[Pie chart of responses to 2a. Legible labels: 3.8%; 73.1%; 4.5%.]

Response                    Number
No                              39
No - with conditions            12
Yes - with conditions           10
Yes                            193
Don't Know                      10
Missing                         51

Summary of 2a:

[Illegible] "not worth it." Many teachers [illegible] question but did answer question 2b with responses such as [illegible] a multiple choice test with choices of statements and reasons.
Question 2b: Is it worth the General Assembly providing $3 per student or would sampling be [illegible]

Responses to 2b:

[Pie chart of responses to 2b. Legible labels: 39.0%; 42.7%.]

Response                    Number
No                              94
No - with conditions            14
Yes                            103
Don't Know                      30
Missing                         74

Summary of 2b:

[Illegible] Many did not know [illegible].
Question 2c: Have there been any benefits beyond scores for students and curriculum [illegible]?

Response to 2c:

[Pie chart of responses to 2c. Legible label: 17.5%.]

Response                        Number
No                                  17
Yes                                 48
Yes - use of method in class        52
Yes - agreement on standards        19
Yes - teacher discussion            19
Yes - other                         16
Yes - combination of above          38
Don't Know                           8
Missing                             98

Summary of 2c:

[Illegible] most often. Teachers generally liked the method for scoring proofs. Almost everyone who responded agreed that the method was useful. Another [illegible] standards: many teachers did not [illegible] expectations of [illegible].
Question 3: Scoring of Proofs

Question 3a: Should lead, practicing teachers continue to be scorers?

Responses to 3a:

[Pie chart; legible label: 24.0%.]

Response                    Number
No                              14
No - with conditions             ?
Yes - with conditions            6
Yes                            241
Unclassifiable response         14
Missing                         36

Summary of 3a:

[Illegible] of those teachers in the no [illegible] a professional group of readers. This question was often answered by the next question. [Illegible] indicated that all geometry teachers should have the experience of grading proofs in this [illegible].




Question 3b: Should different teachers score each year?

Responses to 3b:

[Pie chart of responses to 3b. Legible label: 32.5%.]

Response                    Number
No                              83
No - with conditions            25
Maybe                           18
Yes - with conditions           24
Yes                             73
Unclassifiable answer           26
Don't Know                       6
Missing                         60

Summary of 3b:

[Illegible] positive responses were based on concerns about [illegible] category represent a large number of unsure [illegible].
Question 3c: Can scoring move to school systems in the next five years?

Responses to 3c:

[Pie chart of responses to 3c. Legible labels: Don't know 2.3%; 49.1%.]

Response                    Number
No                              59
No - with conditions            17
Maybe                            8
Yes - with conditions           10
Yes                            108
Unclassifiable response         13
Don't Know                       5
Missing                         95

Summary of 3c:
Question 3d: Should we use school days or weekends if scoring is done at the regional level?

Responses to 3d:

[Pie chart of responses to 3d. Legible label: 79.3%.]

Response                    Number
School days                    203
Weekends                        3?
Unclassifiable response         13
Missing                         59

Summary of 3d:

School days was the most popular answer to this question. Some were strongly opposed to [illegible] time; these [illegible]. [Illegible] of reimbursement for this scoring came up [illegible].
Question 3e: Is this scoring regular duty or extra duty?

Responses to 3e:

[Pie chart of responses to 3e. Legible labels: 15.9%; 5.6%.]

Response                    Number
Regular duty                    49
Extra duty                     193
Unclassifiable response         14
Missing                         63

Summary of 3e:

Extra duty was the winner here. Again, many included payment as an issue.
Question 4: Training on Holistic Scoring

Question 4a: Rate the awareness and scoring sessions compared to other staff development.

Responses to 4a:

[Pie chart of responses to 4a. Legible labels: 1.9%; 12.5%; 81.7%.]

Response                    Number
Worse than other                 2
Comparable                      13
Better than other               85
Missing                        211

Summary of 4a:

Most teachers did not respond at all to this question. But those who did respond responded favorably.
Question 4b: How often should a person receive this training: once, every three years, never?

Response to 4b:

[Pie chart of responses to 4b. Legible labels: Other 2.0%; 21.1%; 33.1%; 43.8%.]

Response                    Number
Once                            53
Once a year                     83
Once every 2-3 years           110
Other                            5
Missing                         64

Summary of 4b:

[Illegible] A lot of teachers favored one intensive [illegible] session [illegible] period. One suggestion as to how this plan could be applied was [illegible].
Question 4c: Have you (1) used this technique in class; (2) has it improved instruction?

Response to 4c(1):

[Pie chart of responses to 4c(1). Legible labels: 24.1%; 75.9%.]

Response                    Number
No                              51
Yes                            161
Missing                        103

Response to 4c(2):

[Pie chart of responses to 4c(2). Legible label: 72.5%.]

Response                    Number
No                              17
No - with conditions             6
Yes                            100
Unclassifiable response         10
Don't Know                       5
Missing                        177

Summary of 4c:

[Illegible] Of those teachers who responded no, many [illegible] teaching proofs because the [illegible].
Question 4d: Would you trust scores from others?

Response to 4d:

[Pie chart of responses to 4d. Legible labels: 6.1%; 2.5%; 89.4%.]

Response                    Number
No                               4
Yes - with conditions           12
Yes                            177
Don't Know                       5
Missing                        117

Summary of 4d:

A typical response: "yes, if trained as I."



Question 4e: Should awareness and/or scoring be continued?

Response to 4e:

[Pie chart of responses to 4e. Legible labels: No 4.5%; Unclassifiable response 3.2%.]

Response                    Number
No                               7
Yes                            143
Unclassifiable response          5
Missing                        160

Summary of 4e:

Half of the teachers left this question blank. The response was almost all positive.
Question 5: Grading

Question 5a: What percentage of a "geometry" total score should a single focused holistic score count?

Responses to 5a:

[Pie chart of responses to 5a. Legible label: 48.8%.]

Response                    Number
0-4%                            14
5-10%                          140
11-15%                          31
16-20%                          61
21-25%                          17
Unclassifiable response         15
Don't Know                       9
Missing                         28

Summary of 5a:

10% was by far the most common answer.



Responses to 5b:

[Pie chart of responses to 5b. Legible labels: 25.2%; 25.2%.]

Response                                 Number
Students take geometry too early              ?
Proofs are difficult                         16
Teachers don't prepare students well          6
Other - Don't understand question            39
Don't Know                                    8
Missing                                       ?

Summary of 5b:

[Illegible] the advanced students would be taught and tested on [illegible] these classes [illegible].
Question 6: Other

Question 6: What would or do you plan to tell your principal, superintendent, or legislator about this effort?

Responses to 6:

Response                                 Number
Not worth time &/or money                    23
Other negative opinions                      21
Beneficial to students &/or teachers         16
Other positive opinions                      95
Facts about sessions                          7
Other                                        51
Missing                                     102

Summary of 6:

[Illegible] responses that this testing should continue [illegible] that scoring was very exhausting but would be beneficial for [illegible] due to the scoring and due to testing as a whole.



